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We describe a multidomain spectral-tau method for solving the three-dimensional 
helically reduced wave equation on the type of two-center domain that arises when 
modeling compact binary objects in astrophysical applications. A global two-center 
domain may arise as the union of Cartesian blocks, cylindrical shells, and inner and 
outer spherical shells. For each such subdomain, our key objective is to realize cer- 
tain (differential and multiplication) physical-space operators as matrices acting on 
the corresponding set of modal coefficients. We achieve sparse banded realizations 
through the integration "preconditioning" of Coutsias, Hagstrom, Hesthaven, and 
Torres. Since ours is the first three-dimensional multidomain implementation of the 
technique, we focus on the issue of convergence for the global solver, here the alter- 
nating Schwarz method accelerated by GMRES. Our methods may prove relevant 
for numerical solution of other mixed-type or elliptic problems, and in particular for 
the generation of initial data in general relativity. 
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I. INTRODUCTION AND PRELIMINARIES 
A. Introduction 

This paper describes spectral methods designed with a specific application in mind: nu- 
merical solution of a mixed-type problem arising in gravitational physics. In reviewing an 
ongoing program to construct helically symmetric solutions to the Einstein equations, this 
introduction recalls the origins of this problem below. However, this paper also serves an- 
other purpose; it demonstrates that spectral-tau integration preconditioning yields highly 
accurate numerical solutions to the helically reduced wave equation (HRWE), a mixed- type, 
variable coefficient, linear partial differential equation (PDE) problem, here posed on a 
nontrivial three-dimensional (3D) domain. Ref. [l| offered spectral-tau integration precon- 
ditioning as a general-purpose strategy for spectral approximation of differential equations, 
and that reference provides the most thorough description and analysis of the technique; 
related techniques were explored in j2| (integration postconditioning) and Jl (nodal inte- 
gration preconditioning), with applications described in, for example, Refs. [4], |5[. However, 
heretofore, spectral-tau integration preconditioning has primarily been studied either in the 
ODE context or for PDE problems posed on single and basic two-dimensional (2D) domains 
(rectangles, annuli, and disks), although we have earlier studied a 2D multidomain scenario 
[6] as a warm-up to this work. While the current paper only considers the HRWE, a chal- 
lenging model problem for the aforementioned target application, it shows how to implement 
the technique in a 3D multidomain setting, addressing several key conditioning issues which 
would seem to generically arise in higher dimensional settings. Therefore, our work should 
facilitate the use of spectral-tau integration preconditioning for other elliptic or mixed-type 
PDE problems. We provide more context and a fuller description of these issues below, but 
now turn to the physical problem which has motivated our work. 

The advent of gravitational wave detection has driven theoretical studies of gravitational 
wave sources. A source that is possibly interesting for ground-based detectors, and perhaps 
the most exciting source for space-based detectors, is the inspiral of two comparable mass 
black holes and their merger to form a single black hole. The early stage of inspiral is modeled 
with reasonable accuracy by perturbations of the Newtonian analysis, and the post-merger 
stage can be analyzed with black hole perturbation theory. The most difficult stage to 
analyze is the intermediate stage, when a few orbits remain. This epoch of inspiral is too 
late for a modified Newtonian approach, but too early for black hole perturbation theory. 
Yet this is the epoch in which a large part of the gravitational wave energy is generated. 

The importance of, and difficulty in, analyzing this epoch was the original motivation for 
an innovative approximation, the periodic standing wave (PSW) method. This approach 
uses the fully nonlinear field interactions, but models the binary compact objects to be 
forever on circular orbits of constant radius. Therefore, both the compact source motions 
and the fields exhibit helical symmetry. Not only does this symmetry reduce the number of 
independent variables, it also completely changes the nature of the governing PDEs, turning 
the problem from the hyperbolic evolution of initial data to one of mixed-type that is elliptic 
near a rotation axis and hyperbolic well outside the axis and beyond the orbits in the wave 
zone of the system. More details of this astrophysical background are given in [gj. Here we 
only point out that recent supercomputer evolutions of initial black hole binary data have 

1 We use this term to refer to a specific technique reviewed below; however, insofar as our work is concerned 
the word preconditioning might be a misnomer. In any case, the technique does achieve sparsification, 
and this is the aspect of the technique we focus on here. 
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been run stably for many orbits in the intermediate epoch. See, for example, Refs. [7-23 
(not an exhaustive li st), and 24{ for a recent review. Even the inspiral of binaries with 



large mass ratios [21l. l22j or high spins [23J, both particularly challenging cases, can now be 
computed. To be sure, recent successes with purely hyperbolic numerical evolutions have 
undercut the original motivation for the PSW approximation. Nevertheless, there remains 
a niche for the PSW approximation for the following reasons. First, it should provide a 
test bench for understanding nonlinear gravitational radiation reaction as a local process. 
Second, a helically symmetric solution of the Einstein equations would be, of its own accord, 
of bewitching interest. 

The numerical computation of PSW fields has, in fact, already been carried out, using 
a single grid and a unique method devised especially for the problem by one of us (RHP) 



and coworkers. These computations were done in a series of steps |25H29j moving from 
linearized scalar fields up to and including the nonlinear tensor fields of general relativistic 
gravity. However, the method used proved inherently too limited in accuracy to be useful. 
Furthermore, despite the attractive simplicity of the computational method, its implemen- 
tation for general relativistic tensor fields proved very challenging. The astrophysical PSW 
problem, therefore, can be viewed as not yet solved. The spectral methods described here 
are designed to solve this astrophysical problem to high accuracy. In any case, as mentioned 
above, our work is relevant as a successful use of spectral-tau integration preconditioning for 
the solution of PDEs (even of mixed-type) on nontrivial 3D domains. From this standpoint, 
the astrophysical problem simply provides a convenient application, with a particularly in- 
teresting feature. In the astrophysical problem, the region in which the PDEs are hyperbolic 
—the distant wave zone — is a region in which the PDEs have only very small nonlinearities. 
The strong linearities, near the source objects, are confined to a region in which the PDEs 
are elliptic. While we do not consider nonlinearities in the current paper, the methods we 
introduce for our linear model problem deliver sufficient accuracy that nonlinearities can 
almost surely be included. 

Multidomain spectral methods for the binary inspiral of compact relativistic objects are 
not new. In pioneering work, nodal (i.e. collocation) methods were used by Pfeiffer et al. [30I 



31| for the elliptical problem of constructing binary black hole initial data, and are now being 
used by the Caltech-Cornell-CITA collaboration (see, for example, j20[) in the fundamentally 
hyperbolic evolution problem. This work, now highly developed, relies on SpEC (the Spectral 
Einstein Code (32[), a large C++ project chiefly developed by Pfeiffer, Kidder, and Scheel, 
but also involving many other researchers and developers. One certainly might attack the 
problem we consider with that software; in particular with SpEC's EllipticModule which uses 



finite-difference preconditioning [33J, |34| and is also already configured to solve nonlinear 
problems. Indeed, the EllipticModule has been used to solve the initial value constraint 
equations on essentially the same type of domain we consider below^, and we suspect that 
SpEC could be used to solve our problem to high accuracy. In any case, to date the 3D 
mixed problem considered here has not been numerically solved via spectral methods. 

Our previous study 0] applied a modal multidomain spectral-tau method to a model 
nonlinear problem of two strong field sources in binary motion with only two spatial dimen- 
sions. That study also relied on integration preconditioning, although the relevant linear 
systems were inverted by direct rather than iterative methods (which was possible since the 
2D problem was less memory intensive). Our 2D study, a proof of concept, showed that 
high accuracy could be achieved with relatively modest memory and run-time requirements. 



In fact, the domain decomposition of Pfeiffer et al. 3CJ, |31[ motivated our own choice. 



(a) 3d view of domain decomposition. 



(b) Cross-sectional view. 



FIG. 1. Domain decomposition. The whole inner configuration of 10 subdomains is enclosed 
within an outer spherical shell which is not shown, save for its inner boundary in (b). Our total 
configuration is therefore comprised of 11 subdomains. 



Here we generalize our 2D method to 3D, that is to three spatial dimensions and one time 
dimension, reduced to a problem with three independent variables by the imposition of heli- 
cal symmetry. Due to the larger set of modes needed for the 3D problem, iterative solution 
of the relevant linear system is now necessary. We use the generalized minimum residual 
method (GMRES) [35|, [36f . Since the amount of work and storage per iteration increases 



with the iteration count [35|, |36| , preconditioning is a crucial issue (and here we mean further, 
one might even say genuine, preconditioning on top of the "integration preconditioning"). 
Through a multilevel preconditioning scheme, we will achieve near round-off accuracy for 
large truncations (~ 600,000 unknowns) with modest iteration counts. Moreover, as we 
achieve a sparse formulation of the relevant linear system, each iteration is also fast. 



B. Specification of the problem 

Before writing down our mixed-type PDE problem, we describe the two-center (hereafter 
2-center) domain T> on which the problem is posed, first recalling the coordinate conventions 
of j26[. Let {x,y,z} represent the inertial Cartesian system related to the spherical-polar 
system {r, 6, <$} in the usual physicist's convention (9 and </> are respectively the polar and 
azimuthal angles). We then introduce a "comoving" Cartesian system 

z = rcos8, x = r sin 9 cos(0 — Qt), y = rsin6 l sin(0 — Qt), (1) 

where Q < 1 is a fixed rotation rate. Note that the z and z-axes coincide, and both are the 
azimuthal axis. Via a simple permutation, we then define a new comoving system 

X = y, Y = z, Z = x, (2) 

for which the Z-axis is not the azimuthal axis. If we imagine two compact objects with "cen- 
ters" located at ^(t) = a\ cos{VLt)e x + a\ sin(fit)e^ and £ 2 W = ~ a 2 cos(f2t)e x — a 2 sin(f2£)e y 
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Spherical shells 


J 


0.4 < r < 1.1 


< 9 < 2tt 


< 4> < it 


H 


0.3 < r < 1.1 


< 9 < 2tt 


< tj> < TT 


O 


2.0 < r < 150.0 


< 9 < 2tt 


< <p < TT 


Cylindrical shells 


1 


0.452 < p < 2.120 


< <f> < 2tt 


-2.120 < Z < -1.525 


2 


0.452 < p < 2.120 


< < 2tt 


-1.525 < Z < -0.275 


3 


0.452 < p < 2.120 


< <f> < 2tt 


-0.275 < Z < +0.375 


4 


0.452 < p < 2.120 


< <f> < 2tt 


+0.375 < Z < +1.625 


5 


0.452 < p < 2.120 


< </>< 27T 


+1.625 < Z < +2.120 


Blocks 


B 


-0.640 < X < 0.640 


-0.640 < K < 0.640 


-2.120 < Z < -1.525 


C 


-0.640 < X < 0.640 


-0.640 < Y < 0.640 


-0.275 < Z < +0.375 


D 


-0.640 < X < 0.640 


-0.640 < Y < 0.640 


+1.625 < Z < +2.120 



TABLE I. Particular domain decomposition. The inner shells J and H are centered at (X, Y, Z) = 
(0,0,-0.9) and (X,Y,Z) = (0,0,1.0), and for each shell the polar system (r,6,cp) is relative to 
the Cartesian system arising from translation of the {X, Y, Z} system to the shell's corresponding 
origin. For each cylinder, the cylindrical system (p, <f), Z) has the standard relationship with the 
{X,Y,Z} Cartesian system. Finally, the outer shell is O centered at the origin of the {X,Y, Z} 
system, but now the (r, 9, (p) system is relative to the {x, y, z] system. 



in the inertial {x, y, z} system, then the Z-axis connects those compact objects. That is, 
£ x = aie^ and £ 2 = ~ ^e^. We introduce the coordinates {f, 9, tp} as spherical coordi- 
nates relative to the comoving {x, y, z} system. We will exclusively work with the comoving 
systems (or simple translations or polar versions thereof), but we will often suppress tildes 
when doing so will not cause confusion. We will, for example, use {r, 9,<p}, hereafter, to 
mean {f, 9, (p}; these coordinates should not be confused with {r, 9, 0} of Eq. (pQ). 

Relative to the system {X, Y, Z}, the 2-center domain D that we have used is depicted 
in Fig. [TJ Topologically, the domain D is a large solid 3D ball with two excised regions 
(each a smaller solid 3D ball). The global domain T> has been broken into 11 subdomains, 
each sufficiently simple to allow for spectral expansions in terms of classical basis functions. 
A large outer shell (labeled O for "out") is not shown in Fig. [Tl. but the remaining 10 
subdomains which comprise the "inner region" are shown. The inner region is comprised of 
two "inner shells" (spherical shells labeled J and H), three "blocks" (rectangular subdomains 
labeled B, C, and D), and five "cylinders" (cylindrical shells labeled 1, 2, 3, 4, and 5). Tabled 
lists the parameters which specify the subdomains comprising V, along with the numerical 
values we have used in the computations to be reported below. Its caption describes the 
relationship between the parameters and the appropriate comoving system. 

The HRWE problem we consider is as follows: 

Lip = g on V, ip = h~ on dH' U d J~ , ( ^- - Q^- + - ] i/j = h + on dO + , (3) 

\or d(p r J 

where the denning operator is 

T d 2 d 2 d 2 „ 2 [„d „d\ 2 d 2 d 2 d 2 „ 2 f~d ~d\ 2 

L = ^ + ^ + T^-tt 2 (x—-y—) =^^ + ^ + ^-tt 2 [Z^-X^ \ . 
dx 2 dy 2 dx 2 \ dy dx J dX 2 dY 2 dZ 2 \ dX dZ J 

(4) 
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Here the constant f2 is the rotation rate, and g is a prescribed source. As described in, for 
example, jsj this problem arises via a helically reduction of the inhomogeneous 3+1 wave 
equation (see also the Appendix). Notice that the problem includes Dirichlet conditions set 
on the inner boundaries of the spherical shells J and H. The boundary condition set on the 
outer boundary of the spherical shell O is of radiative type, and is here expressed in terms 
of the polar coordinates {r, relative to {x,y,z}. Although this radiation condition 
is described precisely below, it may here be thought of as an inhomogeneous Sommerfeld 
condition. (The inhomogeneity h + in fl3]) is a nonlocal expression.) This paper will consider 
only the variable-coefficient linear problem ([3]). For numerical tests, g is taken as zero on 
V, but with distributional support, point sources, located at the centers, £ x and £ 2 ; °f J 
and H . For this choice of g, an exact solution is described in the Appendix. While we 
only consider the linear scalar problem fl3]), the helical reduction of the Einstein equations 
described in 27, involves a tensorial field resolved into ten coupled "helical scalars" 



?/>( a/3 ) each of which obeys a copy L^ a ^ = g^ a ^ of the above equation. However, for this 
formulation g( a/3 ^ is now not an external source, but rather is a nonlinear coupling function 
of the helical scalars built with lower-order terms including some first derivatives. Therefore, 
clearly solving the linear problem that we consider is the first step towards considering the 
helically reduced Einstein equations. 



C. Overview of 3D spectral-tau integration "preconditioning" 



Mostly focusing on the 3D HWRE in three Cartesian variables on a rectangular block, this 
subsection gives a short overview of integration preconditioning for spectral-tau methods, 
in particular focusing on the Kronecker product representations necessary for 3D. We hope 
that this overview will provide the reader with enough context to follow the heavy details 
encountered later when applying the technique on 3D spherical and cylindrical shells. Our 
earlier paper [f| gave a fuller account of essentially the same issues for 2D, many of which 
change little in going to 3D. Therefore, in order to here avoid a prohibitively long discussion, 
we have opted for a short overview, and one tied to our particular problem, pointing the 
reader to [6( for more details. 

The following overview makes use of matrices D k and B^. These respectively represent 
fcth order differentiation and mth order integration with respect to a basis of Chebyshev 
polynomials. As explained later, the subscript [n] indicates that the first n rows of a matrix 
are empty. We do not here provide precise expressions for D k and BVS; however, we list the 
following key properties exploited later: (i) D k is dense upper triangular, (ii) BV\ is sparse 
and banded with upper and lower bandwidth n, and (iii) B^D k = B?7 k for n > k. Here 
B^ = Jr n ] is the identity matrix, except that each entry in its first n rows is a 0. Our earlier 
paper [6( gave the precise expressions for D = D 1 , D 2 , B^ = Bh, and B? 2 ,; our treatment 
of the 3D HRWE (a second order equation) only requires these matrices. That reference 
also discusses the necessary rescalings of these matrices for work with an arbitrary interval 
rather than the standard interval [—1, 1] for Chebyshev polynomials. Of course, Ref. |l( a l so 
considered such expressions and identities, even for more general basis functions. 
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1. Direct product representations 

A function on any of our 3D subdomains is encoded by the modal coefficients for its spec- 
tral expansion, and this set of modal coefficients is often here viewed as a direct (Kronecker) 
product. For example, let us consider a rectangular block delineated by the above comoving 
coordinates {X,Y,Z}, but for the rest of this overview let us suppress the tildes on these 
coordinates. Suppose a function ip(X, Y, Z) on the block is formally represented as a triple 
Chebyshev expansion 

oo oo oo 
n=0 m=0 p=0 

where (£(X),r)(Y),x(Z)) maps our block to the standard cube [—1, f] 3 . To get an approxi- 
mation of ip(X, Y, Z), we consider the truncated series 

N x N Y N z 

V Nx , Ny , Nz iP(X,Y,Z) =^^^ nrnp T n (^X))T m (r){Y))T p { X {Z)), (6) 

n=0 m=0 p=0 

so that ip(X, Y,Z) is represented (either exactly or approximately) by a three-index set 
{ipnmp '■ < n < Nx,0 <m< Ny,0 < p < Nz} of modal coefficients. We represent this 
finite collection of modal coefficients as a column vector ip, with components ?/>(«) = (V')q 
determined by the direct product representational 

${n{N Y + 1)(N Z + 1) + m(N z + 1) + p) = $ nmp . (7) 

A single matrix operating on the vector i/> (all modal coefficients representing the given 
function) may then equivalently be considered as having block-elements which are other 
matrices. We always view the modal set for a function on a cylindrical or rectangular 
subdomain as a direct product of three one-dimensional sets. However, in the case of the 
spherical shells (J, H, and O), we sometimes view the set of modal coefficients as the direct 
product of only two sets, the set corresponding to the radial modes and the set corresponding 
to the spherical harmonic modes (which involve both the polar and azimuthal angles). 

In our notation, operators corresponding to a single dimension, that is "simple" matrices 
(whose elements are numbers, not matrices), are usually represented by a capital in an 
ordinary font, such as the identity operator/matrix Ix or the matrix Dz which realizes 
differentiation by Z. Matrices which act on the full set of modal coefficients are represented 
by a calligraphic capital, for example B. Thus, if -Bz[i] represents integration in Z, then on a 
rectangular subdomain we might have B = Ix®Iy®Bz[\] as the matrix which, when applied 
to a vector i\) holding the full set of modal coefficients, realizes integration in Z with no action 
in X or Y. That is, if ip(X, Y, Z) has modal coefficients ?/?, then formally J ip(X,Y, Z)dZ 
has modal coefficients Bi\). The [1] on Bz[\\ indicates that all entries in the first row (in fact, 

3 We could have instead taken 

i>{m{N x + l)(N z + l)+p(N x + 1) +n) = Vw P , 

which might prove advantageous for representation of the VI 2 term in ©. Our choice (0 has been 
determined by technical decisions made during the initial construction of our code. In any case, based 
on some experimentation, we believe this choice makes little difference, at least for Q < 0.5 (well in the 
range of rotation rates we aspire to treat). 
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the zeroth row by our conventions) of this matrix are zero, so that (Bif))(a) = whenever 
p = for the index a = n(N Y + l)(Nz + 1) + m( y N z + l)+p [cf. Eq. (J7|)]. This choice would 
fix the integration constant (a function of X and Y) in j if)(X, Y, Z)dZ, but this empty row 
might also be subsequently filled with a "tau-condition," that is another vector chosen to 
fix a different constant. 

2. Integration preconditioning 

Let us briefly review the key ideas behind the technique of integration preconditioning, 
continuing to assume a rectangular block subdomain and also assuming the operator (j4]) for 
the HRWE. (See Refs. 0, Q for nonlinear scenarios, |l[ for more complicated ID operators, 
and Refs. [H, [37[ for more exotic basis functions). After enough invocations of the Leibniz 
rule, we may express the operator in flU) (again with tildes suppressed) as 

L = d Y + d 2 x (l - Vl 2 Z 2 ) + d 2 z (l - Vl 2 X 2 ) - n 2 (d x X + d z Z - 2d x d z XZ). (8) 

We view this equation as an operator identity, that is the partial derivatives see both terms 
like Z 2 and XZ to the right, and also the function (not shown) on which L will eventually 
act. The Chebyshev polynomials T n (£) obey the three-term recurrence 2£T n (£) = T n+ i(£) + 
T n -i{C,)- Here £ may be viewed as a suitable rescaling of either X, Y, or Z. Therefore, 
multiplication by the independent variable (here £) is represented in the corresponding space 
of modal coefficients by a banded (evidently a tridiagonal) matrix A%. In fact, multiplication 
by a polynomial p(£) is similarly represented by a banded matrix p(A^). Now, the matrix 
which represents L is 

C = I x <g> D 2 Y ® I z + D\ <8> I Y ® (I z - tt 2 A 2 z ) + (I x - Vl 2 A 2 x ) ® I Y ® D% 

- tt 2 (D x A x ®I y ®Iz + Ix®Iy® D z A z - 2D X A X ® J y ® .D Z A Z ). (9) 

where each Z) represents differentiation in the space of modal coefficients for one coordinate. 
This matrix is of the general form 

222 / 2 2 2 \ 

i=0 j=0 fc=0 \r=0 s=0 t=Q / 

where the Pijk,rst are constants (most zero in our case). In Eq. fflOl) the matrix within the 
parenthesis is banded and sparse; however, overall £ is neither, since these desirable features 
are spoiled by the derivative matrices (see the second paragraph of this subsection). 

The idea behind integration preconditioning is to "undo" all of the matrix differentiations 
which appear in ffTU]) through repeated application of integration matrices [cf. point (iii) in 
the second paragraph of this subsection]. To illustrate, we consider the modal representation 
Cif) = g of ([3]) on the rectangular block, ignoring for the time being the issue of boundary 
conditions. Introducing B = -BjU 2 i ®B Y , 2 , ®B 2 Z < 2 ^ we then form B£ij> = Bg. The coefficient 
matrix of the new "preconditioned" system is then 

BC = B 2 X[2] ® I Y [2] ® B 2 Z[2] 

+ I x[2] ® B 2 Y[2] ® {B 2 Z[2] - n 2 B 2 z[2] A 2 z ) + {B 2 X[2] - n 2 B 2 x[2] A 2 x ) ® B 2 [2] ® I z[2] 

- n 2 (B x[2] A x ® B 2 Y[2] ® B 2 Z[2] + B 2 X[2] ® B 2 Y[2] ® B Z[2] A Z - 2B X[2] A X <g> Sf- [2] ® B Z[2] A Z ). 

(11) 
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£ace 


Rows 


Index restrictions 


Z — Z m [ n 

^ ^max 


n(N Y + l)(N z + 1) + m(N z + 1) + 

^ t at i iw?\r i 1^ i ™/Ar i 1 \ i i 
n(I\ Y + iJl-'^z + J- J + rn{l\z + L) + L 


< n < N x , 0<m<N Y 

n AT n m-. AT 

U S n S -'"X) U S S ivy 


Y = Yrni n 

Y — Y 

± 1 max 


n(N Y + l)(N z + l)+p 

n(N Y + l)(N z + 1) + (N z + 1) +p 


< ra < iVx, 2<p<N z 
< n < iVx, 2<p<N z 


X — X m [ n 
X — ^max 


m(iV z + 1) +p 

(iVy + 1)(JV Z + 1) + m(N z + l)+p 


2 < m < TVy, 2<p<N z 
2<m< Ny, 2<p<N z 



TABLE II. Filling of empty rows for blocks. 



Because it is built only with Bs and As, this matrix is sparse and banded, albeit with large 
bandwidth due to the direct product structure. 

The matrix EC has many empty rows, signaling missing information. The spectral-tau 
procedure is to put the "tau conditions," here the boundary conditions, in these empty rows, 
and the corresponding inhomogeneous values in Bg. When this procedure is carried out cor- 
rectly, with due regard to possible repetition in the specification of boundary data, the empty 
rows provide precisely the space needed for the boundary data of a well-posed problem. To 
enforce boundary conditions for the example at hand, we proceed as follows. Define, for ex- 
ample, h + (X, Y) = ip(X, Y, Z max ) and h~(X,Y) = ip(X,Y, Z min ). Then Dirichlet boundary 
conditions along the XF-faces of a block are expressible as 

N z 

tpnmpSp = h nm , (12) 

p=0 

where a double Chebyshev projection appears on the right-hand side. Moreover, 5 + (all l's) 
and S~ (alternating +1 and —1) are the (N z + 1) dimensional "Dirichlet vectors." Similar 
equations correspond to YZ and XY faces, and in all 2(N X + l)(N Y + 1) + 2(N Y + 1)(N Z + 
1) + 2(Nx + l)(Nz + 1) such equations are possible. However, there are only 

2(N X + 1)(N Y + 1) + 2(N Y + 1)(N Z + 1) + 2(N X + l)(N z + 1) - A(N X + N Y + N z + 1) 

available empty rows in ffTTl) . However, there are precisely 4:(N X + N Y + N z + 1) linear 
dependencies amongst the set of all possible boundary equations, owing to the fact that 
faces share common edge values. Table [TT] gives our prescription for filling empty rows. 

As a result of the integration preconditioning, we have reformulated the set of equations 
in terms of matrices with a drastic reduction in nonzero elements. In the context of ODEs, 
that is in the ID origins of this method, Ref. [l| has thoroughly studied the condition num- 
ber of the resulting preconditioned matrix with respect to norms that arise from diagonal 
equilibration. While in ODE settings integration preconditioning often improves the condi- 
tioning of the original system, in the PDE context at hand B x ^ £g> B Y ^ <g> B 2 Z ^ is not an 
optimal preconditioner in that it does not approximate (in any measure that we are aware 
of) the inverse of the original coefficient matrix. Nevertheless, one should expect that the 
"preconditioned" coefficient matrix has a more clustered spectrum, since the Bs are com- 
pact operators (even as infinite dimensional matrices). A clustered spectrum often yields 
favorable convergence properties in the context of Krylov iterative methods. Regardless, 
sparsification is a desirable property, since it clearly affords a fast matrix-vector multiply 
in Krylov methods. Therefore, for multidimensional problems we are more comfortable 
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focusing on the sparsifying aspect of the technique, with the understanding that further pre- 
conditioning (described below) on top of the "integration preconditioning" will be required 
to enhance convergence of the underlying linear solver (in our case GMRES). 

II. SPARSE SPECTRAL APPROXIMATION OF THE 3D HRWE 

Starting with one of the forms appearing in (]3J) and after further transformations, this 
section describes the spectral-tau representation of L on each of the basic subdomains, except 
for the block case which we have already described in Sec. II CI These descriptions allow for 
implementation of the lefthand side of ()3]) as a "matrix- vector multiply," an implementation 
required by the iterative solver GMRES j3o| . 

A. Outer spherical shell 

In the polar system associated with the comoving system ([[]) the operator fl4]) becomes 

r 2 L = r 2 A-n 2 J , r 2 A = d 2 r r 2 - 23 r r + A S 2, J Q = r 2 <9j, (13) 

where A^2 is the unit-sphere Laplacian and O stands for the outer spherical shell. These 
equations should be viewed as operator identities acting on scalar functions. The solution 
ip to Lip = g is formally represented as the triple series 

oo oo 

i>(r,6,<p) = ^^^ nP£o(cos^)T n (eW) 
n=0 e=o 

oo oo £ 

+ 5J^^-P£m(cos0)[^ )2 m-i,nCos(m^) + V^m.n sin(m</?)] T n (£(r)) , (14) 

71=0 1=1 m=l 

where the Pe m {cos6) are normalized associated Legendre functions [38] and £(r) maps the 
radial domain to the standard interval [—1, 1]. The corresponding numerical approximation 
is the following finite expansion: 

N r Ng 

V Nr , Ng ?p(r,6,Lp) = ^2^2ip£o n P£o(cos9)T n (£(r)) 
n=0 e=o 

N r Ng N 

+ ^ ^2 ^2 p £rn(cos9) [ipe,2m-i,n cos(nnp) + ^)£.2 m ,n sin(myj)] T„(£(r)). 

n=0 £=1 m=l 

(15) 

We represent the triply- indexed modal coefficients ipe qn as a vector ip{a) of length (Ng + 
1)(2N$ + l)(N r + 1), with the two notations connected by 

^{l{2N e + l)(jV r + 1) + q{N r + 1) + n) = $ lqn , (16) 

For £ < Ng the second sum over m in ffl5l) includes too many terms. Indeed, m should run 
from 1 to £ only (with the m = terms appearing in the first sum); therefore, whenever 
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q > 2£, we must set ipe qn = by hand. We have enlarged the space of modal coefficients 
for later convenience when using spherical harmonic transformations. With this remark in 
mind, for our representation ( fl6l) the index a of the vector ip(a) starts at and takes on 
all values corresponding to the ranges O<£<iV0,O<g< 2Ng, and < n < N r . We 
denote by P the projection matrix whose range is the set of vectors associated with proper 
spherical harmonic expansions, 

(P^)(£(2Ng + l)(N r + 1) + q{N r + 1) + n) = 0, for q > 21 (17) 

Let us first consider a sparse approximation of the Laplacian term r 2 A, which from f JT3|) 
has the spectral representation 

A 2 r A = P[l e ® I v ® (D 2 A 2 r - 2D r A r ) - % 2 <g> I r ], (18) 

where A 2 . = Ig <S> I v <S> A 2 , and A r is the matrix equivalent to multiplying r-dependent 
functions by a factor of r. In the first term within the square brackets Ig ® I v ® means that 
there are no operations mixing modes ipiqn with different values of £, or of q (i.e. the dual 
indices to 9 and <fi). The operator (D 2 A 2 — 2D r A r ) is the matrix equivalent of the partial 
differentiation d 2 r 2 — 2d r r in ( TT3|) . The matrix Jzf 2 , representing — A S 2 in (fl3|) . is comprised 
of (Ng + 1) constant blocks £(£ + l)I(2N e +i)x(2N e +i) i n eacn subspace labeled by £. 

To get a sparse form of the Laplacian, we define B = Ig^iI^^B 2 and write the expression 

(&4 2 A) modificd = ¥[l e (8) J v ® (I r[ 2]A 2 " 25 r[2] A r ) + if 2 <g> 5 2 [2] ] + (/„ g) ® J r - P) . (19) 

Here the "modified" notation indicates that, by the addition of the last term above, l's have 
been placed on the diagonal in rows set to zero by the projection operation, so that the 
result is a nonsingular matrix. Therefore, to ensure that a solution if) to the corresponding 
linear system obeys 

i>(£(2Ng + l)(N r + l) + q(N r + l) + n) = 0, forg>2^. (20) 

We demand that the source obeys g = Wg. Finally, from f[T3j) the sparse preconditioned 
form of the operator Jo is 

BJo = -P[/ fl ®Jt 2 ® B 2 r[2] A 2 ] , (21) 

where Jt 2 = diag(0, 1, 1, 4, 4, • • • , iV|, iVf ) is the (2N g + l)-by-(27V fl + 1) matrix representing 
— <9 2 . Therefore, (BA 2 A) modlficd — Vt 2 BJo is our sparse form of the overall coefficient matrix, 
prior to inclusion of boundary conditions. 

We now consider specification of outer radiation conditions, for which we summarize 
results given in [6j. Specification of Dirichlet conditions on the inner boundary dO~ of the 
outer shell O is essentially the same as specification on the boundaries dH^ of the inner shell 
H, and we describe that specification in detail below. The specification at dO + , however, 
involves radiation conditions. To define these, we set R = r max , the radial coordinate value 
of dO + , and introduce 

vWO = /^ ex P [ - - 5™ - HK+Wo, (22) 
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which satisfies V^ + i/ 2 (£) ~ 1 as £ — >■ oo. Here Hg^\, 2 
of the first kind, of half-integer order 
need the "frequency-domain kernel," 



(£) is the cylindrical Hankel function 
+ 1/2. For our radiative boundary condition we will 



1^+1/2 (0 = f 



^Vl/2(0 
^+1/2 (0 ' 



(23) 



Radiation conditions 



which is computable as a continued fraction via Steed's algorithm [39 
involve 

p = mQR + Im (i7£ +1/ / 2 (mf2i?)) , q = 1 — Re (i^ + i/ 2 (mf2.R)) , (24) 

with p = and g = £ + 1 for m = modes (see Ref. j(3j for details). The p and q here (in 
particular the q) are not related to the indices on ipi qn . Both uses of p and q will not appear 
in the same formula. As tau-conditions, our radiation conditions are then expressible as 



N r 



n=0 



(25a) 
(25b) 



n=0 



Here <5 + (all l's) and 8~ (alternating +1 and —1) are the (N r + 1) dimensional "Dirichlet 
vectors" used to impose Dirichlet conditions at the endpoints of a coordinate range. Simi- 
larly, v + and v~ are the (N r + 1) dimensional "Neumann vectors" used to impose derivative 
conditions at the endpoints. Details are given in [3, 0] . 

Along the block-diagonal of the coefficient matrix (i3^A) modlficd — Q 2 BJ'o, there are 
(N r + l)-by-(N r + 1) blocks, one for each (£, q) pair. When q exceeds 2£, each such block is 
the identity matrix; however, the block corresponding to a physical mode < q < 2£ has 
the form 



. (26) 



Here represents a row of zeros, and B iq is a nonzero (iV r — l)-by-(iV r + 1) submatrix. 
The zeros in the first two rows are filled in with the Dirichlet boundary conditions on dO~, 
using 5~, and the radiation boundary conditions on dO + , using (125]) . Since these radiation 
conditions mix one cosine (q = 2m — 1) and the other sine (q = 2m) mode, the tau conditions 
lead to a coupling between among the blocks. The resulting 2(N r + l)-by-2(iV r + 1) block 
neighborhood, with Dirichlet and radiation boundary conditions, takes one of the following 
forms (either representation is possible due to the homogeneity of the boundary conditions): 



6- 





P 5+ 


Ru+ + q5 + 


Qi,2m-1 








5~ 


Ru + + q5 + 


- P 5+ 





Q£,2m 



or 



6- 





Ru + + q5 + 


— p5~ 


g£,2m-l 








5- 


P 5 + 


Ru+ + q5+ 





g£,2m 



(27) 
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where represents either a row (when opposite a 5~) or a (N r — l)-by-(N r + 1) submatrix 
of zeros (when opposite a B). Boundary conditions for m = (zero modes) correspond to 
blocks 

6- 

Ru+ + qS+ 
B w 

Evidently, in this case only a single azimuthal block need be considered. 



(28) 



B. Inner spherical shells 

We stress that the polar coordinates (r, 9, 0) appearing in this subsection are not the 
polar coordinates (r, 9, <p) used in the last, although only the notation for the azimuthal 
angle {(f) vs. (p) reflects the difference. We start with 01]), assume that one of the "holes" is 
at Z = zh, and dehne new comoving coordinates 

z = Z-z H , x = X, y = Y. (29) 
The helically reduced wave operator in the new coordinates is 



a 2 d 2 d 2 



-\- — — — — f2 



dx 2 ' dy 2 dz 2 



^ + Z) dx- X dz- 



1 2 



(30) 



Spherical polar coordinates {r,9,(p} in this subsection correspond to the system {x,y,z}. 

As already mentioned, the system {r, 9, 0} is not the system {r, 9, cp} corresponding the 
outer shell. Nevertheless, for an inner shell our treatment of the Laplacian part of the 
operator is the same as the treatment given in the last subsection. In particular, we adopt 
the same conventions for the indexing of the spectral representation, and therefore again 
arrive at the expression (fl9"|) . Notationally, the only difference is that we replace all instances 
of tp with (/>. Therefore, having already considered (r 2 x) the Laplacian part of the HRWE, 
we turn to (r 2 x) the term in (1301) paired with — Q 2 , 

J H = r 2 [(z H + z)d/dx - xd/dz} 2 . (31) 

To facilitate the expression of the derivatives in ( 13~T|) in terms of operations on {r, 9,4>}, 
we introduce 

Q = sin 9 cos <fi (32) 

P = cos 9 cos 4>d/d9 — esc 9 sin (fid/dcf) (33) 

N = cos 4>d/d9 — cos 9 esc 9 sin <pd/ <90, (34) 

and note that 

d/dx = Qd/dr + r _1 P, zd/dx — xd/dz = N . (35) 
With the identities (which should be read as operators acting on a scalar function) 

r 2 d 2 = d 2 r 2 - Ad r r + 2, rd r = d r r - 1, r 2 d r = d r r 2 - 2r, (36) 
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we then find that Jh of (l3Tj) can be written as 

Jh = z 2 H Q 2 8y + z 2 H {PQ + QP- 4Q 2 )<9 r r + z H (NQ + QN)d r r 2 

+ z H (NP + PN- 2NQ - 2QN)r + N 2 r 2 (37) 
+ z 2 H (P 2 + 2Q 2 -PQ-2QP). 

We use Jh to denote the spectral form of the differential operator Jh- The corresponding 
sparse form BJ H = (J e ® 1$® B 2 ^)J H is then 

= 4Q 2 g) I [2]r A 2 + 4(PQ + QP - 4Q 2 ) g> 5 [2]r A (38) 
+ ^(NQ + QN) ® E [2]r v4 2 + z H (NP + PN - 2(NQ + QN)) <g> B 2 2]r A r 
+ N 2 <g> B 2 2 yA 2 r + 4(P - 2Q)(P - Q) ® S| ]r . 

Here the san serif N, P, and Q are matrices acting on the spectral space of spherical harmonic 
expansion coefficients. Whence we need explicit realizations of the following matrices: Q 2 , 
PQ + QP, PN + NP, PQ + QP, and (P — Q) 2 . We compute these matrices as truncations 
of the corresponding exact infinite dimensional matrices described below (with products 
computed before truncation). The truncated matrix components N(a,/3) of N obey the 
following condition: 

N(£(2iV + 1) + q,k(2N e + 1) +p) = 0, for q > 2i or p > 2k, (39) 
and similarly for the components P(a, 0) and Q(a, 0). (Here we have switched to parenthesis 



my 

notation 36| for the components N(i, j) = Njj of a matrix.) This condition properly treats 
the extraneous components we have included in our expansion vector ip. 

Of the three angular differential operators P,Q,N, we only consider N in detail here, as 
its action on spherical harmonics is the simplest to describe. Partial formulas are given for 
P and Q at the end of this subsection. Using standard formulas from the theory of angular 



momentum (see the appendix of [40(]), we have 

NY im = ~ ^(£- m )(e + m + l)Y e>m+1 - ~ v /(£ + m )(£-m + l)r,, m _ 1 . (40) 

Before completing our construction of N, P, and Q, we first collect some formulas which 
relate the standard complex representation of spherical harmonics Ye m (6, <ft) to the real- 



valued representation. The normalized Legendre functions are [38 



7 -<"> = <" 1 >V ?? r 1 fe3 ,m (41) 

with Pp(u) the standard associated Legendre function (as given, for example, by Thorne 
[HI). We then have 

Y em = x [^-(-l) m P em e im t Y^_ m = J^-P tm e- lm \ m > 0. (42) 
V 2n V 2n 

For fixed £, the expansion in azimuthal index takes the form 

e e 
c eo Y eo + ( c ernY(>rn + Q,-m^,-m) = a eo P m + 2J Pem cos m<\) + b im sinm0] , (43) 

m=l m=l 



15 



where the real expansion coefficients are \/27iaio — Qo an d, for m > 1, 

V2na em = Q m (-l) m + q _ m , \p2^h m = \[c lm {-l) m - ct- m ]- (44) 
We define another set of complex expansion coefficients 



/*m = ^a/ + - m + i)c£ >m -i - ^-y/(£ - m)(£ + m + l)c£ >m+1 , (45) 
so that, from fT40l) . the action of N has the effect 

oo I oo I 

1=0 m=-£ 1=0 m=-£ 

We can then represent N by the matrix N that converts the vector of coefficients Q m to the 
vector fe m by / = Nc. We also define real coefficients di m , ee m which are related to fe m in 
the same way that a^ m) b^ m are related to Q m , and then view the action of N as a mapping 
from the real coefficients a^ m , bi m to the real coefficients d,£ m , ti m . 

Turning to the representation for Q, we likewise use results tabulated in the appendix of 
13 to find 



~ 2 V (2£+l)(2£ + 3) Fm ' m - 1 (47) 



1 /( l + m + l)(l + m + 2) 
(2£+l)(2£ + 3) 



1 /(£ + m)(£ + m-l) 

(2£ + 1)(2£- 1) ^ 1 ' m " 1 



i l (e-m)(e-m-i) v 

+ 2V (2£ + 1)(2£- 1) 



so that / = Qc is determined by 



fim ~ 2 V (2£-l)(2£+l) Q ^ m+1 (48) 



1 l(£ + m-l)(£ + m) 

(2£-l)(2£+l) Q - 1,m - 1 

1 /(£ + m + 2)(£ + m + 1) 

2 V (2£+l)(2£ + 3) Q+1 ' m+1 



1 /(£-m + 2)(£-m + l) 
+ 2 V (2£+l)(2£ + 3) 
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Boundary 


Rows 


Index restrictions 


' ''max 


£(2N e + l)(N r + 1) + q{N r + 1) + 
£{2N e + l){N r + 1) + q(N r + 1) + 1 


< I < Ng, 0<q<2£ 
0<£<N e , 0<q<2£ 



TABLE III. Filling of empty rows for shells. 

Again, we may express the action of Q as a mapping from a^ m , bi m to dg m , e^ m . Finally, we 
use the identity 



PY f 



im 



1 

+ 2< 



(£-m + l)(£-m + 2) 
(2£ + l)(2£ + 3) 

'(£ + m + l)(£ + m + 2) 
(2£ + l)(2£ + 3) 



(49) 



m+l 



2 {£ + 1) \l (2£ + l)(2£-l) y/ " 1 ' m - 1 



, / (*-™)(*-™-i) y 

+ l)y ( 2 £+l)(2£-l) r ^ m+1 ' 

to similarly define the action of P as a mapping from a^ m , be m to di m , e^ m . 

To enforce the inner and outer boundary conditions in ([3]), we fill empty rows in 
(BA 2 A) modmcd -tt 2 BJ H with the tau-conditions. Let h + {6, 0) = ip(r max , 9, 0) and h~(9, 0) = 
V , ( r mia) 0)- Then Dirichlet boundary conditions on the inner and outer boundaries of the 
shell are expressible as 



N r 



(50) 



ra=0 



where spherical-harmonic projections appear on the righthand side. Table IHII shows how 
empty rows are filled to enforce these boundary conditions. 



C. Cylindrical shells 

Throughout this section we suppress the tildes on X,Y,Z. Let p = (X 2 + Y 2 ) 1 ^ 2 , and 
multiply Eq. (j4|) by p 2 to get the operator identity 

p 2 L = p 2 [dl + d 2 x + d%- n 2 d 2 x Z 2 - n 2 d 2 z X 2 - n 2 (d x X + d z Z - 2d x X ■ d z Z)] . (51) 

Since X = p cos and Y = p sin 0, 

p 2 (d 2 x + d 2 Y + d 2 z ) = d 2 p p 2 - 3d pP + 1 + 01 + p 2 d 2 z , (52) 

again with the view that this is an operator identity. Eq. (15 2p has the spectral representation 

A 2 p A = 1+ <g> (I, + D 2 p A 2 p - 3D p A p ) <g> I z + D 2 ® I p ® I z + J ® A 2 p ® D 2 . (53) 

Here A 2 represents the matrix 1$ £g> A 2 <g> 7^. To achieve this representation we start by 
introducing a mapping of [Z min , Z max ] to [—1, 1] with the function x{Z)i so that Z dependence 
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can be expressed with the Chebyshev polynomials T p (x(Z))- Similarly £(p) maps [p m i n , Pma j 
to [—1,1]. The solution is then formally expressed as 



ifi(p,<t>,Z) = ^^^on P T n (^(p))T p ( X (Z)) 

n=0 p=0 

oo oo oo 

+ [^2k-i,n P cos(k6)+^ 2ktnp sm(k9)}T^p))T p ( X (Z)), (54) 

fc=l n=0 p=0 

with corresponding numerical truncation (taking even for simplicity) 

Vn^nMpA, Z) = ^^^ 0np T n (£(p))T p (x{Z)) 

n=0 p=0 

5^0 N p Nz 

+ [^2fe-i,n P cos(A;0) +ip2k,n P sm(k(j))]T n ^(p))T p (x(Z)). (55) 

k=l n=0 p=0 

The direct product structure in ( 1531 has been determined by the convention 

ij)(m(Np + l)(iV 2 + 1) + n(iV z + 1) + p ) = $ mnp . (56) 

We now take the matrix representation of the differential operator in (I5TI) . and "sparsify" it 
via multiplication by B = B 2 Cg> B 2 ™ g) -Bf j 2 j. In this operator the B 2 ^ and -Bf^j correspond 
to the usual B operators representing integration twice over a coordinate, and leaving two 
empty rows to be filled by tau-conditions. The operator B 2 applies to a coordinate with no 
endpoints, and hence no applicable boundary conditions. It represents double integration 
over all Fourier modes except the zero mode, which is left unchanged. The matrix that 
accomplishes this has the explicit form B^ = diag(l, — 1,— 1,— j, — |, — |, — §,•**)■ Although 
the operation on does not play a role in handling boundary conditions, it should further 
enhance the spectrum clustering of the matrix that must be inverted. 
With this B the sparsified form of f )53|) becomes 

BA 2 p A = Bl®{Bl [2] +I p[2] Al-?>B p[%] A p )®B^ (57) 

Our analysis of the terms in the HRWE proportional to Q 2 starts with the expressions 

p 2 dx = p 2 cos <fid p — p sin tfid^ (58) 
p 2 d 2 x = p 2 cos 2 (fid 2 — 2p cos <fi sin <fid p d$ + 2 cos sin cfid^ + p sin 2 <fid p + sin 2 (fid 2 . (59) 

With the operator identities p 2 <9 2 = <9 2 p 2 — Ad p p + 2, pd p = d p p — 1, and p 2 d p = d p p 2 — 2p, 
we then find 



p 2 dx = d p p 2 cos <fi — p(2 cos <fi + sin 0<9^) (60) 
p 2 d 2 x = d 2 p 2 cos 2 <fi + d p p(l — 5 cos 2 <fi — 2 cos cfi sin 0<9^) 

+ sin 2 (fidl + 4 cos <fi> sin 000 + 3 cos 2 — 1. (61) 
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Still viewing these relationships as operator identities, we now exploit the product rule for 
differentiation to rewrite the terms involving angular derivatives, with the results 



p 2 dx = d p p 2 cos — p(cos <t> + sin 0) (62) 
p 2 d 2 x = d 2 p 2 cos 2 — d p p(l + cos 2 + 2c^ cos sin 0) + d 2 ^ sin 2 + sin 2 0. (63) 

These equations then yield 

p 2 d x X = d p p 3 cos 2 4> — p 2 (cos 2 4> + cos sin 0) (64a) 
p 2 d 2 x = (d 2 p p 2 -3d pP +l)cos 2 ( j ) 

- <9 p p(l - 2 cos 2 + 2fy cos sin 0) + d\ sin 2 + 2 sin 2 0- 1, (64b) 

expressions which prove useful in obtaining the matrix representation of the operator on the 
righthand side of (15Tj) . To optimize the implementation, we have chosen the first term on 
the righthand side of fl64b ) to match a similar term in the Laplacian part of the operator 
[cf. Eq. (J52D]. 

We split the HRWE operator on a cylindrical shell as p 2 L = p 2 A — Q 2 J, where 



J = J x + J 2 + J 3 + J 4 = p 2 d x X{l - 2d z Z) + p 2 d 2 x Z 2 + p 2 d z Z + p 2 d 2 z X 2 . (65) 





./ J2 J3 Ji 



The piece p 2 A was shown to lead to fl57|) . We now focus on J whose matrix representation 
stems from the representations of its constituents. Equations fl63|) and f )64|) give us 

Ji = C\ ® g) (Jz - 2D Z A Z ) - (Cj + D^C^S^) ®A 2 p ® (I z - 2D Z A Z ) (66) 
J 2 = C 2 ® (J + D 2 A 2 - 3D P A P ) ® A% - (J - 2C 2 + 2D^C^) ® D P A P ® A 2 Z (67) 



(D 2 Sl + 2S 2 -l)®I p ®A 2 



z 



h = I<p ® A 2 ® (68) 



J 4 = CJ®^® J D|, (69) 

where and are respectively the matrices in the Fourier basis which correspond to 
multiplication by sin0 and cos0. Applying the sparsifying matrix E = B 2 ® B 2 ^ <g> B z ^, 
we then have 

Eh = B\C\ ® B p[2] A 3 p ® {B 2 m - 2B m A z ) (70) 

- (B 2 C 2 + B^C^) ® B 2 p[2] A 2 p ® (B 2 Z[2] - 2B m A z ) 

Eh = B\C\ ® (B 2 r[2] + I p[2] A 2 p - 3B r[2] A p ) ® B 2 Z[2] A 2 Z (71) 

- {B\ - 2S|C| + 2B m C^S^) ® B p[2] A p ® B 2 Z[2] A 2 Z 
+ {I m S 2 + 2Bpl - B 2 ) ® B 2 p[2] ® B 2 m A\ 

Eh = B\® B 2 p[2] A 2 p ® B m A z (72) 
Eh = BlCl®B 2 p[2] A 4 p ®I z[2] . (73) 

The sparsified matrix representing ([EED is then EA 2 C = EA P A - Vt 2 (Eh + Eh + Eh + Eh). 
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Boundary 


Rows 


Index restrictions 


P — Pmin 
P /'max 


m(N p + 1)(N Z + l)+p 

m(N p + l)(N z + l) + (N z + l)+p 


< m < Nf, 2<p<N z 
0<m<N^, 2 < p < N z 


Z = Z m i n 

7—7 

*J ^max 


m(N p + 1)(N Z + 1) + n(N z + 1) + 
m(N p + 1){N Z + 1) + n(N z + 1) + 1 


0<n<N p , < m < 
0<n<N p , < m < 



TABLE IV. Filling of empty rows for cylinders. 



To enforce boundary conditions, we fill empty rows in the matrix BA P A - Vl 2 BS with 
the tau- conditions. Let h + (cj),Z) = ip(p max ,<f),Z), h~(<fi,Z) = ^(/Wi, 0, Z) and f + (p,4>) = 
ip(p, 0, Z max ), f~(p,cj), Z min ) = ip(p, (ft, Z min ). Then Dirichlet boundary conditions on the 
inner and outer axial boundaries and on the top and bottom caps are expressible as 

N P N z 

2^ ^Pmnp^n = h mp , Ipmnp^p = f mn - (^4) 

n=0 p=0 

There are (N^ + 1)(N Z + 1) + (N p + 1)(-A^, + 1) such equations possible. However, owing to 
the fact that the caps shares common edges with both the inner and outer axial boundaries, 
there are 2(N ( f > + 1) linear dependencies amongst these equations, and in fact the number of 
available empty rows is precisely 

(N^ + l)(N z + 1) + (N p + 1)(JV + 1) - 2(N^ + 1). 

Table [TV] shows how we fill zero rows to enforce the boundary conditions. 

III. GLUING OF SUBDOMAINS 

So far we have described individual shell, cylinder, and block subdomains (and their 
associated tau-conditions) as if they were decoupled. All the subdomains are, of course, 
coupled and we refer to the process of making them parts of a single problem as "gluing." 
Matching, or gluing, must be done for each subset of subdomains that touch, whether that 
touching is a finite volume overlap or a lower-dimensional shared boundary. The global 
problem requires matching for the following subdomain configurations: 

(i) Two adjacent cylinders. 

(ii) One inner shell and a combination of cylinders and blocks. 

(iii) One cylinder and one block. 

(iv) The outer shell and the combination of blocks B and D and all cylinders. 

We describe (i) and (ii) in detail, provide a sketch of (iii), and omit a description of (iv) 
altogether. Although more complicated, a description of (iv) would parallel that of (iii). 

Before giving more details, we comment on how such gluing is reflected in the overall 
linear system. Let, for example, ip J and ip B respectively represent the vectors of spherical- 
harmonic Chebyshev and triple Chebyshev expansion coefficients corresponding to the inner 
shell J and block B of Fig. [TJ The overall set of unknowns is the concatenation 

* = (V> J , 4> B , V> D , V>\ V' 2 , V> 3 , ^\ V> 5 , V>°) 4 , 
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which satisfies a linear system stemming from Eq. ([3]), 

= BG, (75) 

where Q is a similar concatenation of the sources g on the individual subdomains. Here 
B indicates integration "preconditioning" (sparsification) on all subdomains. Symbolically 
then, the coefficient matrix M. is BC, now with L standing for the spectral representation 
of the HRWE operator L on the whole 2-center domain. In this symbolic view, we have 
ignored multiplications by radial powers on spheres and cylinders. 

Each of the eleven subdomains in Fig. [TJis represented by one of eleven super-blocks (J- J, 
H-H, ■ ■ • , O-O) which sit along the diagonal of the overall super-matrix Ai representing the 
PDE on the whole 2-center domain. We use the term "super-block" here since the matrix 
corresponding to each subdomain arises, as we have seen, from a direct product structure 
(and so could be viewed as already in a block form). The supplementary equations needed for 
gluing are placed within existing empty rows in the same manner as for the tau- conditions. 
However, the gluing conditions stretch beyond the super-block diagonal, since they are 
linear relationships between the spectral expansion coefficients on two (or more) separate 
subdomains. For example, the gluing together of cylinders 1 and 2 (which share a common 
cap) involves not only filling rows within the 1-1 and 2-2 super-blocks along the diagonal of 
Ai, but also filling rows within the 1-2 and 2-1 off-diagonal super-blocks. 



A. Gluing of cylinders to cylinders 

As a specific example, let us consider the gluing of cylinders 1 and 2 in Fig. [TJ which as 
indicated share the cap Z = Z*, where Z* is Z max for cylinder 1 and Z min for cylinder 2 (the 
common cap has a hole in the middle, since 1 and 2 are cylindrical shells). Let, for example, 
Vip 1 be shorthand for the numerical solution V^^i^iip 1 for cylinder 1, as expressed in (155]) . 

The restriction Vijj l (p,4>, Z*) is a two-variable function on the cap Z = Z*, and it can be 

expanded in a finite Fourier-Chebyshev series, with J^/bfo ^Ink^k as the corresponding two- 
index modal coefficients. Likewise, the restriction (dVip 1 /dZ)(p,(j), Z*) of the Z-derivative 

has a Fourier-Chebyshev series with two-index modal coefficients JZk^o^qn^^k ■ The a i 
factor is a scaling of the Neumann vector u + , and its presence is necessary since the range 
of Z is not [—1, 1] (details are given in 0]). 

On the Z = Z* cap we likewise consider the numerical solution Vil> 2 (p,(j), Z*) and its 
Z-derivative (dVtp 2 /dZ)(p, <ft, Z*), as determined by the numerical solution Vtp 2 on cylinder 
2. We distinguish between two cases: (i) both the N p and N$ truncations are the same for 
cylinders 1 and 2 (but N\ ^ N\ is allowed), and (ii) at least one of these truncations differs 
between the two cylinders (i.e. either N ^ N 2 or ^ N 2 , or both, hold). Let us first 
consider case (i), returning to case (ii) in the next paragraph. For case (i) both 7V(p,0,Z*) 
and Vijj 2 (p,<f), Z*) have two-surface modes which are in one-to-one correspondence, and 
likewise for the derivatives. Therefore, for this case we enforce^ 

^lnk S k = Yl ^lnk S k> ^\nk a ^t = Yl ^%* a * V k ' ( 76 ) 

k=0 k=0 k=0 k=Q 

4 We regret an error in the definition of v~ in Ref. Eq. (42). The correct expressions are = 
[T{,(±l),Tl(±l),m±l),m±l),Ti(±l),- ■ ■] = [0, 1,±4,9,±16, •••]. In Ref. the righthand side of the 
second equation of (69) is also off by a sign. 
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for each Fourier-Chebyshev index pair (q,n). Here, for case (i), the matching conditions 
enforce continuity between the finite representations Vip 1 and Vip 2 across the cap, and 
also continuity between the finite representations dVip 1 /dZ and dVip 2 /dZ. These matching 
conditions are reflected in the overall matrix Ai as follows. As the super-block corresponding 
to each of the subdomains 1 and 2 has been sparsified in the described fashion, each has 
a collection of empty rows which are also empty throughout At. In, say, the empty rows 
stretching across the 1-1 and 1-2 super-blocks, we insert the first set of conditions given in 
f!76|) . In the empty rows stretching across the 2-2 and 2-1 super-blocks, we similarly place 
the Neumann conditions, the second set of conditions given in f!76|) . This filling of empty 
rows to achieve the required matching consists of relationships of modal coefficients with no 
reference to any "sources" ; they are homogeneous equations. 

To better understand the issues which will arise in matching volume-overlapping sub- 
domains, we now consider case (ii), the case in which the cylinders 1 and 2 give rise to a 
disparate set of surface modes on the Z = Z* cap. In this case, for example, we again have 

J2k=o ^qnkK as the modal coefficients determining Vip l (p, (p, Z*), and Ylk=o ^\nk a ^ v k as tne 
modal coefficients determining (dVip 2 /dZ)(p,(p, Z*). Now, however, (ITSj) is not applicable. 
Instead, we now fix [cf. the first equation in f )74p ] 

Nl „ 

E^**" 2 "*' (77) 
k=0 

where here /+ (for < q < Nj and < n < N}) and e~ n (for < q < JVj and < n < N 2 ) 
are not to be viewed as inhomogeneities, rather as expressions built respectively with the 
modal coefficients for Vip 2 (p, (p, Z*) and (dVip 1 /dZ)(p, (p, Z*). Note that, as with Eqs. (ITSj) . 
these equations have the form "cylinder 1 coefficients = cylinder 2 coefficients". 

Let us consider only /+ since similar comments apply to e~ n . First, we start with 
cylinder 1 and define a Chebyshev-Lobatto/Fourier grid {(pj,(pi) '■ < j < N^,0 < i < 
N^} on the Z = Z* cap of cylinder 1. The use of these points affords a double discrete 
Fourier-Chebyshev transform, through numerical quadrature, relating function values at the 
points and mode coefficients. (In practice, we have exploited the trigonometric form of the 
Chebyshev polynomials and have used the FFT to define both the Fourier and Chebyshev 
components of this transform.) The double discrete transform allows us to express the modal 
coefficients f+ n in terms of the function values f^, at p iy <pj, for Z = Z* on cylinder 1, in a 
form 

p 4> 
i=0 j=0 

Next, at the nodal points (pj, (pi) of cylinder 1, we evaluate f£ in terms of the expansion for 
the solution on cylinder 2, thereby finding 

ft = ^ 2 (P„ 0i, Z.) = J2 £ii«n E ^nA- (79) 

q=0 n=0 k=0 

Here, the values £ij, qn arise from the evaluations of the modal functions (Chebyshev and 
Fourier) of cylinder 2 at the nodal points (pj, (pi) of cylinder 1. When the expressions for /rt 

from fl79|) are substituted in (175]). we get expressions for /+ in terms of the modal coefficients 



E 

k=0 



^qnk^k 



/« 



qn 



'qn 
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i/jq nk representing the solution in cylinder 2. Finally, we substitute this f+ n into (1771) . which 
yields relationships between the modal coefficients on cylinder 1 and cylinder 2 that express 
continuity of the solution across Z = Z*. 

The linear relationships ( I77|) would likewise be inserted into the overall coefficient matrix 
Ai. Similar to before, the righthand side of the first equation in (J77J would fill empty rows 
stretching across the 1-2 super-block, with the 5 + vectors on the lefthand side filling empty 
rows across the 1-1 super-block. The relationships expressed in the second equation in (I77|) 
would fill empty rows stretching across the 2-2 and 2-1 super-blocks Finally, we note that 
the equations (fTTJ reduce to ([76]) when JVj = iVj and JVj = N 2 p . 

B. Gluing of an inner shell to cylinders and blocks 

The shells J and H depicted in Fig. [1] overlap multiple blocks and cylinders, and for this 
overlap the issue of gluing is complicated. Since the issue is essentially the same for the 
gluing of H to blocks C, D and cylinders 3,4,5 or J to blocks B, C and cylinders 1,2,3, let 
us here focus on the first case. The issue here is that parts of the outer boundary dH + of 
H sit in blocks C,D and cylinders 3,4,5. Let # represent one of the tags C, Z),3,4,5, and 
let us consider the portion dHt of dH + which intersects subdomain j^. At nodal points 
on dH^ we require that the values of ip agree whether they are computed with the spectral 

representation ip for H or the spectral representation -0* for For nodal points (9j,<fi k ) 
this condition is [cf. Eq. f l50|) ] 

h+ k = h+iOj, <j) k ) = 7^ # (x(r max , 9 k , 0,)) for (r max , 9 j: <j) k ) e dH+. (80) 

Here Vip^ is the numerical solution (V indicates finite expansion) associated with and 
x are the relevant 3D coordinates on subdomain Looping over all of the sub domains 

# = C, D,3,4, 5 defines the grid function h^ k at all nodal points of dH^. The explicit 
matching conditions (equivalent to the + case in (1501 ) can then be realized by expressing 
the spherical harmonic transform h~\ q = J2flo Sfc=o ^<?j^j7c as a matrix- vector product 
involving all The resulting equations are placed within empty rows of Ai which stretch 
across the H-H and H-# super-blocks. 

Again let # represent one of the tags C, D, 3, 4, 5. Then the boundary d# of subdomain 

# includes a portion dj^u overlapping shell H which gives rise to further gluing relations. 
These equations will be inserted into empty rows of Ai which stretch across the and 
H^-H super-blocks. For concreteness, we consider only the # = C case. Here the + case of 
f|T2|) is relevant, although the h^ m now arise as the double XF-Chebyshev transform of 

h tk = Y,J2^onPeo(cos8 jk )T n (£{r jk )) 

n=0 1=0 

N r N() Ng 

+ Yl p em( cosd jk) [i>f,2m-i, n cos(mcf> jk ) + ^f 2 m,n sm(m<p jk )] T n {£{r jk )) . (81) 

n=0 1=1 m=l 

The H point (r jfc , 9j k , (j>j k ) corresponds to a Chebyshev-Gauss-Lobatto collocation point 
(X(£j), Y(i] k ), Z max ) along the top AF-face of block C. Again, via a matrix representation 

hnm = Sj=^o ^2k=o ^nm,jkh^ k of the transform, we may express this matching condition more 
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directly. As mentioned, these equations will be inserted into empty rows of Ai which stretch 
across the C-C and C-H super-blocks. 

C. Gluing of a cylinder to a block 

Here we sketch either the gluing of block B and cylinder 1, block C and cylinder 3, or 
block D and cylinder 5. We focus on the middle CcLSG clS cL representative example. This 
process involves both (a) gluing two Y Z and two XZ faces of block C to cylinder 3, and (b) 
gluing the inner radial boundary of the cylinder to the block. The process for (a) is similar 
to the gluing described in the last paragraph (in which a face of C is glued to H), and we 
omit a description. To express the matching equations which enforce (b), we first define 

N x N Y N z 
n=0 m=0 p=0 

Here we use the following points: 

(Xjk,Y jk , Zj k ) = (X(p min , <f)j, z k ), Y(p min , <f)j, z k ), Z 

(Pmin; 0ji Zk)), (83) 

where (p m i n , 4>j, Zk) are nodal points along the inner radial boundary of cylinder 3. Next, 

we consider the Fourier-Chebyshev transform g~ p = Ej1*oEfco^™pj'^ifc' m terms of the 
transform the matching equations are 

Np 

E ^mnpK = Qmp- ( 84 ) 
n=0 

These equations are inserted into empty rows of M. which stretch across the 3-3 and 3-C 
superblocks. 

IV. NUMERICAL SOLUTION OF THE 3D HRWE 

Both on single subdomains and on the global 2-center multidomain £>, this section con- 
siders numerical solution of the HRWE for the field of two point sources in a circular binary 
orbit. For this problem we have an essentially closed- form exact solution, a superposition of 
the fields for two point sources, each point source in a circular orbit and described by the 
Lienard-Wiechert solution (1A11I) found in the appendix. A numerical solution is a collection 
of modal expansion coefficients; however, comparisons with the exact solution are always 
computed in physical space on the nodal grid (or grids in the multidomain case) dual to the 
modal expansionjf] All numerical solves are performed iteratively with preconditioned GM- 
RES jlH, and this section also describes the relevant preconditioning (both for subdomain 
solves and for the global multidomain solve). 

Using the sparse representations described in Sec. [Til i n Sec. II V Al we numerically solve the 
HRWE on the following subdomains (cf. Fig.[T]and Tabled]): the outer shell O, (inner) spher- 
ical shell J, (inner) cylindrical shell 5, and (inner) block D. For each subdomain labeled (in- 
ner) the HRWE operator is implemented as a matrix-vector multiply within preconditioned 



5 As these nodal grids are coarse, the Li and norms reported in the tables do not settle down quickly. 
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(a) Cylinder experiment. (b) Block experiment. 



FIG. 2. Individual subdomains. We consider cylindrical shell 5 highlighted in the left figure, 
block D highlighted in the right, and the bottom inner spherical shell J shown in both. 

GMRES without restarts. For these subdomain solves, Dirichlet boundary conditions are 
taken from the exact Lienard-Wiechert solution, but the outer shell also problem involves 
the radiation boundary conditions given in Eq. ( 12 5j) . The particular subdomains considered 
in Sec. IIV Al are representative, and similar experimentation on each subdomain has de- 
termined the chosen truncations for the 2-center multidomain tests described in Sec. IIV Bl 
Such experiments empirically yield appropriate truncations necessary to achieve a desired 
accuracy. All tests in Sees. IIV Al and IIV Bl involve the following configuration: two charges, 
one with zh = 1, Qh = 1 and the other with zj = —0.9, Qj = 0.5. Sec. IIV Al considers 
Q = 0.1, 0.3, 0.5, 0.7. The rotation rate Q = 0.3 is large for an astrophysical problem, while 
Q = 0.5, 0.7 are very large rates chosen to "break" our numerical methods. 

A. Numerical solution on individual subdomains 

We consider the outer shell first, since results for this subdomain are the most disappoint- 
ing. The solve for this subdomain differs from the rest. Indeed, since the representation 
of the HRWE on the outer shell is comprised of (£, m) blocks along the block diagonal, 
we invert each of these physical modes using //[/-factorization [the "physical modes" are 
those not annihilated by the projection operator P defined in ffT7"|) ]. Let Ng = £ ma , x , so that 
M = (N e -\-l)(2N e + l)(N r + l) is the system size, with N" 2 ~ ANqN 2 the storage requirement 
for the full coefficient matrix. However, storage of all blocks involves (No + 1) matrices of 
size (N r + l)-by-(A 7 " r + 1), one for each zero mode, in addition to ^N e (N e + 1) matrices of 
size 2(N r + l)-by-2(N r + 1), one for each fixed-m cos/sin pair. Therefore, storage for this 
solve scales like 

(N e + l)(N r + l) 2 + 2N e (N + l)(N r + l) 2 ~ 2N$N? = 0{N r ■ Af). (85) 

Table M collects results for the outer shell experiment. While excellent for Q = 0.1, they 
exhibit marked degradation as Q increases. A splitting of the single outer shell into multiple 
concentric shells would likely yield improved accuracy for large Q. 
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Q = 0.1 


N r 


^raax 


L2 error 


Li norm 


Loo error 


Loo norm 


65 


10 


1.1899E-05 


1.8977E-01 


2.7970E-04 


1.1808E+00 


85 


18 


2.1320E-07 


1.8725E-01 


3.6548E-06 


1.1810E+00 


125 


28 


2.4580E-10 


1.8472E-01 


5.4193E-09 


1.1810E+00 


185 


42 


2.2846E-13 


1.8297E-01 


2.7858E-12 


1.1810E+00 



n = 0.3 


N r 


^max 


Z/2 error 


L2 norm 


Loo error 


Loo norm 


65 


10 


7.4142E-04 


1.9030E-01 


1.0557E-02 


1.2624E+00 


85 


18 


9.4481E-04 


1.8777E-01 


1.3180E-02 


1.2628E+00 


125 


28 


1.2701E-04 


1.8523E-01 


2.0861E-03 


1.2628E+00 


185 


42 


5.1221E-06 


1.8347E-01 


1.2307E-04 


1.2628E+00 



Q = 0.5 


N r 


^max 


L2 error 


L2 norm 


Loo error 


Loo norm 


65 


10 


1.4622E-02 


1.9196E-01 


1.9489E-01 


1.4875E+00 


85 


18 


5.6234E-02 


1.9726E-01 


7.8285E-01 


1.4904E+00 


125 


28 


6.7047E-03 


1.8689E-01 


1.4751E-01 


1.4996E+00 


185 


42 


3.6457E-03 


1.8503E-01 


7.6831E-02 


1.5055E+00 



Q = 0.7 


N r 


^max 


L2 error 


L2 norm 


Loo error 


Loo norm 


65 


10 


1.8330E-01 


2.6488E-01 


4.2561E+00 


4.9750E+00 


85 


18 


3.8033E-02 


1.9502E-01 


7.1574E-01 


2.0804E+00 


125 


28 


3.9821E-02 


1.9408E-01 


1.0589E+00 


2.2504E+00 


185 


42 


2.7380E-02 


1.9029E-01 


7.3279E-01 


2.2182E+00 



TABLE V. Outer spherical shell O test 



We next consider the inner spherical shell J. Note that Q = 0.5 corresponds to a shell 
just within the elliptic region, but Q = 0.7 corresponds to a shell which does not lie fully 
within the elliptic region. Tables EH and IVHI list errors, without and with precondition- 
ing. For the sake of comparison, in both these and subsequent tables we have chosen the 
same requested tolerances (for the GMRES solve) uniformly in Q, although for larger Q the 
achieved accuracy could likely be attained with a weaker tolerance and fewer iterations. The 
chosen preconditioner is block- Jacobi. Namely, we invert physical (£, m) modes along the 
block diagonal using a precomputed L[/-factorization. The storage and scaling properties for 
this preconditioner are exactly the same as described for the direct solve on the outer shell. 
However, for inner shells the HRWE representation is not block diagonal in (£, m) pairs (as 
on the outer shell), rather the operator has significant bandwidth in both indices. Therefore, 
storage of the full matrix for an inner shell would require correspondingly larger memory 
relative to the preconditioner storage. Further, the preconditioner storage requirement could 
be reduced by inverting each sin/cos block mode independently. Moreover, were the precon- 
ditioner chosen to correspond to only the Laplacian part of the operator, then it could be 
used for the solves on both shells if their dimensions and truncations were the same. In any 
case, the chosen preconditioner notably improves the convergence of the GMRES solver. 

Table IVIIII list the results for the corresponding single cylinder experiment, with block 
L£/-preconditioning similar to before. That is, for each Fourier mode we invert the associated 
diagonal block. Our choice f )56|) of direct product structure for the cylinders determines 
that each block is (N p + l)(Nz + l)-by-(AT p + l)(Nz + 1). For cylinders, preconditioning 
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Q = 0.1 


N r 


^raax 


L2 error 


L2 norm 


Lao error 


Loo norm 


iterations 


tolerance 


12 


12 


3.4702E-06 


1.3516E+00 


2.2656E-05 


1.9143E+00 


54 


1.0000E-07 


18 


23 


9.9814E-09 


1.3499E+00 


3.6488E-08 


1.9168E+00 


129 


1.0000E-09 


20 


33 


6.0107E-11 


1.3498E+00 


2.6944E-10 


1.9173E+00 


238 


1.0000E-11 


30 


46 


6.2864E-13 


1.3481E+00 


2.8333E-12 


1.9176E+00 


415 


1.0000E-13 



fl = 0.3 


N r 


^max 


Z/2 error 


L2 norm 


Loo error 


Loo norm 


iterations 


tolerance 


12 


12 


7.6593E-06 


1.3566E+00 


7.8541E-05 


1.9255E+00 


57 


1.0000E-07 


18 


23 


2.1272E-08 


1.3550E+00 


3.8188E-07 


1.9278E+00 


138 


1.0000E-09 


20 


33 


1.1806E-10 


1.3550E+00 


2.4707E-09 


1.9283E+00 


255 


1.0000E-11 


30 


46 


4.3184E-13 


1.3532E+00 


5.2289E-12 


1.9286E+00 


442 


1.0000E-13 



n = 0.5 


N r 


^max 


L2 error 


L2 norm 


Loo error 


Loo norm 


iterations 


tolerance 


12 


12 


2.3588E-05 


1.3782E+00 


2.7337E-04 


1.9561E+00 


61 


1.0000E-07 


18 


23 


8.3249E-08 


1.3769E+00 


1.5539E-06 


1.9579E+00 


146 


1.0000E-09 


20 


33 


4.9181E-10 


1.3770E+00 


1.1765E-08 


1.9583E+00 


271 


1.0000E-11 


30 


46 


7.9161E-13 


1.3752E+00 


1.8530E-11 


1.9585E+00 


471 


1.0000E-13 



n = 0.7 


N r 


^max 


L2 error 


L2 norm 


Loo error 


Loo norm 


iterations 


tolerance 


12 


12 


4.7252E-04 


1.4267E+00 


3.1710E-03 


2.2326E+00 


87 


1.0000E-07 


18 


23 


1.1589E-05 


1.4259E+00 


1.2079E-04 


2.2356E+00 


461 


1.0000E-09 


20 


33 


8.3186E-08 


1.4262E+00 


6.8077E-07 


2.2353E+00 


1459 


1.0000E-11 


30 


46 


1.3484E-09 


1.4243E+00 


1.4001E-08 


2.2351E+00 


5215 


1.0000E-13 



TABLE VI. Inner spherical shell test J without preconditioning. 

amounts to direct inversion of each Fourier mode along the block diagonal. With M = 
(N^ + l)(N p + 1)(N Z + 1) the system size, the storage requirement for the preconditioner 
requires + 1 matrices of size (N p + l)(N z + l)-by-(A r p + 1)(N Z + 1), and so scales like so 

(N p + 1) 2 (N Z + l) 2 (iV + 1) = 0(N p N z ■ A/"). (86) 

While N p N z M < A/" 2 , this requirement is somewhat memory intensive. However, we have 
observed essentially the same performance when using the corresponding Laplacian part of 
the operator to define the preconditioner. Provided that the dimensions and truncations of 
two individual cylinders match, the same preconditioner could then be used for both. 

Table [IX] list errors for the block experiment, and again with a block- Jacobi precondi- 
tioner. In this case there are N x + 1 blocks with size (N y + 1)(N Z + l)-by-(N y + 1)(N Z + 1). 
Storage of the block preconditioner therefore scales as 

(N x + l)(N y + 1) 2 (N Z + l) 2 = 0(N y N z ■ AT). (87) 

Again, were the preconditioner based on the Laplacian part of the operator, it might be 
reused for the solves on different blocks. 

B. Numerical solution on the 2-center multidomain 



We have also used GMRES 35] to solve the linear system A4*& = EQ given in Eq. ( ITS]) and 
corresponding to the HRWE on the full 2-center multidomain T>. Section |TT] has described 
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n = o.i 


N r 


^raax 


L2 error 


L2 norm 


Loo error 


Loo norm 


iterations 


tolerance 


12 


12 


3.4196E-06 


1.3516E+00 


2.2499E-05 


1.9143E+00 


3 


1.0000E-07 


18 


23 


3.4877E-09 


1.3499E+00 


3.6560E-08 


1.9168E+00 


4 


1.0000E-09 


20 


33 


1.3949E-11 


1.3498E+00 


2.1394E-10 


1.9173E+00 


4 


1.0000E-11 


30 


46 


3.0104E-14 


1.3481E+00 


3.0975E-13 


1.9176E+00 


5 


1.0000E-13 



Q = 0.3 


N r 


^max 


L2 error 


L2 norm 


Loo error 


Loo norm 


iterations 


tolerance 


12 


12 


7.6487E-06 


1.3566E+00 


7.8536E-05 


1.9255E+00 


4 


1.0000E-07 


18 


23 


2.0798E-08 


1.3550E+00 


3.8215E-07 


1.9278E+00 


6 


1.0000E-09 


20 


33 


1.0362E-10 


1.3550E+00 


2.4686E-09 


1.9283E+00 


7 


1.0000E-11 


30 


46 


1.0361E-13 


1.3532E+00 


3.1604E-12 


1.9286E+00 


9 


1.0000E-13 



n = 0.5 


N r 


^max 


L2 error 


L2 norm 


Loo error 


Loo norm 


iterations 


tolerance 


12 


12 


2.3577E-05 


1.3782E+00 


2.7330E-04 


1.9561E+00 


7 


1.0000E-07 


18 


23 


8.3163E-08 


1.3769E+00 


1.5545E-06 


1.9579E+00 


10 


1.0000E-09 


20 


33 


4.9053E-10 


1.3770E+00 


1.1755E-08 


1.9583E+00 


13 


1.0000E-11 


30 


46 


6.2389E-13 


1.3752E+00 


1.8454E-11 


1.9585E+00 


17 


1.0000E-13 



n = 0.7 


N r 


^max 


L2 error 


L2 norm 


Loo error 


Loo norm 


iterations 


tolerance 


12 


12 


4.7262E-04 


1.4267E+00 


3.1713E-03 


2.2326E+00 


30 


1.0000E-07 


18 


23 


1.1596E-05 


1.4259E+00 


1.2093E-04 


2.2356E+00 


154 


1.0000E-09 


20 


33 


8.3367E-08 


1.4262E+00 


6.8017E-07 


2.2353E+00 


429 


1.0000E-11 


30 


46 


1.3430E-09 


1.4243E+00 


1.3999E-08 


2.2351E+00 


1390 


1.0000E-13 



TABLE VII. Inner spherical shell J test with preconditioning. 



the coefficient matrix Ai, and therefore also implementation of the "matrix- vector multiply" 
— > Ai^f. Implementation of this multiply is required by the GMRES algorithm (with or 
without preconditioning). However, a simple unpreconditioned GMRES strategy results in 
extremely poor convergence. Therefore, we have implemented (left) preconditioned GMRES 
which further requires implementation of the operation ^ — > A4~ p p rax \l/ i n terms of a suitable 
approximate inverse M~p prox ~ M.~ x . In this section we describe application of M.~p praic , 
and document tests of the full global solve. We stress that the preconditioning afforded 
by -Mapprox is neither (i) the integration "preconditioning" technique used to achieve sparse 
representations of (jl]) on each of the basic subdomains nor (ii) the preconditioning (typically 
a form of block-L£7) used for individual subdomain solves. However, type (ii) preconditioning 
does define part of the A4~ p rox application 



The action of -M~p rox is defined through the simple alternating Schwarz method (42 
Application of this preconditioner relies on independent numerical solves over (i) the inner 
shells J and H, (ii) the glued subregior@ R comprised of blocks and cylinders depicted in 
Fig. El and (iii) the outer spherical shell O. More precisely, starting with a vanishing initial 
vector we perform the following iteration. 

1. Solve (also by GMRES, as described in Sec. IIV A\) the HRWE on the inner shells J and 
H. For these solves inner Dirichlet boundary conditions are the fixed physical ones, 



6 Whereas the basic spectral elements (such as shell J, block B, and cylinder 1) have been called subdomains, 
we informally refer to the multidomains R and G (defined later) as a subregions. 
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= 0.1 


N r 




N z 


L,2 error 


L2 norm 


Loo error 


Loo norm 


iterations 


tolerance 


13 


5 


7 


7.9884E-08 


9.0116E-01 


4.6388E-07 


1.5004E+00 


3 


1.0000E-08 


19 


9 


9 


5.2802E-10 


8.9887E-01 


2.7463E-09 


1.5006E+00 


4 


1.0000E-10 


23 


13 


13 


5.6239E-13 


8.9775E-01 


4.4170E-12 


1.5006E+00 


5 


1.0000E-12 


29 


19 


18 


8.3992E-15 


8.9680E-01 


9.3259E-14 


1.5007E+00 


6 


1.0000E-14 



fl = 0.3 


N r 




N z 


L2 error 


L2 norm 


Loo error 


Loo norm 


iterations 


tolerance 


13 


5 


7 


6.2980E-06 


9.4046E-01 


3.8531E-05 


1.5817E+00 


6 


1.0000E-08 


19 


9 


9 


1.2577E-07 


9.3796E-01 


5.9139E-07 


1.5841E+00 


10 


1.0000E-10 


23 


13 


13 


4.9307E-09 


9.3677E-01 


4.0773E-08 


1.5849E+00 


14 


1.0000E-12 


29 


19 


18 


3.2422E-10 


9.3574E-01 


1.3965E-09 


1.5861E+00 


18 


1.0000E-14 



n = 0.5 


N r 


Afy 


N z 


L2 error 


L2 norm 


Loo error 


Loo norm 


iterations 


tolerance 


13 


5 


7 


7.3178E-04 


1.0239E+00 


3.9733E-03 


1.7886E+00 


59 


1.0000E-08 


19 


9 


9 


2.1193E-04 


1.0218E+00 


1.2227E-03 


1.8228E+00 


161 


1.0000E-10 


23 


13 


13 


4.0853E-05 


1.0203E+00 


1.8564E-04 


1.8303E+00 


531 


1.0000E-12 


29 


19 


18 


7.0259E-06 


1.0190E+00 


7.7578E-05 


1.8285E+00 


1576 


1.0000E-14 



n = 0.7 


N r 


Nf 


N z 


L2 error 


L2 norm 


Loo error 


Loo norm 


iterations 


tolerance 


13 


5 


7 


3.2460E+00 


3.4276E+00 


2.2390E+01 


2.1481E+01 


130 


1.0000E-08 


19 


9 


9 


2.3293E-02 


1.1548E+00 


1.9059E-01 


2.5635E+00 


503 


1.0000E-10 


23 


13 


13 


1.1546E+00 


1.6568E+00 


9.4853E+00 


1.1688E+01 


1420 


1.0000E-12 


29 


19 


18 


1.1305E-02 


1.1512E+00 


1.1635E-01 


2.5463E+00 


10816 


1.0000E-14 



TABLE VIII. Cylindrical shell 5 test with preconditioning. 



while outer boundary conditions stem from interpolation of the numerical solution on 
R (which is initially zero). The tolerance for these solves is typically 0. l*tol, where 
tol is the tolerance for the global GMRES solve of M^f = BQ. 

2. Solve (also by GMRES) the HRWE on R. For this solve inner Dirichlet boundary 
conditions stem from interpolation of the solutions on J and H, while outer Dirichlet 
boundary conditions stem from interpolation of the solution on the outer shell O 
(which is initially zero). This GMRES solve must also be preconditioned, as discussed 
shortly. The tolerance for this solve is typically 0.2*tol. 

3. Solve the HRWE on the outer spherical shell O, with inner Dirichlet boundary con- 
ditions stemming from interpolation of the numerical solution on R and the outer 
radiation boundary conditions described in Sec. Ill Al As described in Sec. IIV A\ this 
solve is performed via direct block-by-block LU factorization (note that the factor- 
ization of each block mode is precomputed and then used over and over in this third 
step). 

This three-step iteration may be viewed as the Gauss-Seidel method, here applied in block 
form to J |J H, R, O. Typically, we have chosen 4 sweeps of this block Gauss-Seidel method. 
Step 2 requires its own preconditioning to enhance convergence. Here we have again em- 
ployed the alternating Schwarz method, now with blocks corresponding to B, C, D, and the 
subregion G which is the composite of glued cylinders (1-5). This "inner" preconditioning 
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= 0.1 


N x 


Ny 


N z 


L,2 error 


L2 norm 


Loo error 


Loo norm 


iterations 


tolerance 


14 


14 


7 


3.7513E-07 


1.1367E+00 


4.2360E-06 


1.7854E+00 


41 


1.0000E-08 


19 


19 


9 


6.3235E-09 


1.1394E+00 


1.3616E-07 


1.8098E+00 


62 


1.0000E-10 


28 


28 


13 


1.4351E-11 


1.1418E+00 


3.0822E-10 


1.8040E+00 


102 


1.0000E-12 


32 


32 


18 


1.2749E-13 


1.1421E+00 


5.3182E-12 


1.8054E+00 


141 


1.0000E-14 



fl = 0.3 


N x 


Ny 


N g 


L2 error 


L2 norm 


Loo error 


Loo norm 


iterations 


tolerance 


14 


14 


7 


3.9020E-07 


1.1955E+00 


4.2789E-06 


1.8911E+00 


42 


1.0000E-08 


19 


19 


9 


6.4495E-09 


1.1986E+00 


1.4194E-07 


1.9176E+00 


65 


1.0000E-10 


28 


28 


13 


1.4448E-11 


1.2013E+00 


4.9319E-10 


1.9116E+00 


109 


1.0000E-12 


32 


32 


18 


8.4807E-14 


1.2017E+00 


2.9017E-12 


1.9131E+00 


154 


1.0000E-14 



n = 0.5 


N x 


Ny 


N z 


L2 error 


L2 norm 


Loo error 


Loo norm 


iterations 


tolerance 


14 


14 


7 


7.5596E-07 


1.3306E+00 


4.0594E-06 


2.1990E+00 


77 


1.0000E-08 


19 


19 


9 


9.7796E-09 


1.3351E+00 


1.6231E-07 


2.2244E+00 


266 


1.0000E-10 


28 


28 


13 


2.2103E-10 


1.3391E+00 


2.8426E-09 


2.2217E+00 


1495 


1.0000E-12 


32 


32 


18 


4.3594E-12 


1.3399E+00 


5.1787E-11 


2.2231E+00 


3089 


1.0000E-14 



n = 0.7 


N x 


Ny 


N z 


L2 error 


L2 norm 


Loo error 


Loo norm 


iterations 


tolerance 


14 


14 


7 


4.0731E-03 


1.5016E+00 


1.9997E-02 


3.0390E+00 


454 


1.0000E-08 


19 


19 


9 


3.9332E-04 


1.5065E+00 


2.1243E-03 


3.1084E+00 


1349 


1.0000E-10 


28 


28 


13 


2.3057E-06 


1.5118E+00 


1.3776E-05 


3.1019E+00 


5337 


1.0000E-12 


32 


32 


18 


1.0375E-06 


1.5133E+00 


6.6072E-06 


3.0967E+00 


20000* 


1.0000E-14 



TABLE IX. Block D test with preconditioning. The asterisk on 20000* indicates the 
convergence was halted before the tolerance had been achieved; the achieved tolerance 1.3 x 10~ 14 
was close to that requested. 



typically involves 5 sweeps, with appropriate interpolation. Each individual GMRES solve 
on B, C, D, and G uses the tolerance 0.1*tol. Table IXl depicts the overall multilevel 
preconditioning scheme. 

Before turning to tests of the full solve, we consider the solve on the multidomain sub- 
region G comprised of the glued cylinders (1-5). Again, this solve is performed as part of 
the preconditioner for step 2 of the global preconditioner (see Table IXl) . Table IXj collects 
errors and iteration counts associated with this solve for increasing truncations. Each solve 
documented in the table has been started with the zero vector as initial iterate, and here we 
employ restarting after 20 iterations. The reported iteration counts in Table IXll are cumula- 
tive over restarts. The individual block-L/7 preconditioning on each subdomain (1-5) is the 
only preconditioning used for this solve. Nevertheless, it suffices to drastically reduce the 
number of iterations (which would otherwise be in the thousands, with or without restarts). 



Results for the full solve appear in Table 1X1 1 1 Notice that the largest truncation involves 
more than half a million unknowns (597788 to be precise). In fact the number of unknowns is 
larger, since we add modes to shells, but here count only the "physical modes" for allowable 
(£, m) pairs (cf. Sec. Ill Aj) . Each solve in the table is used as the initial guess for the next, 
which is why the count of outer GMRES iterations goes down. 
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(a) Inner shells J and H. 



(b) Glued subregion R. 



(c) Outer shell O. 



FIG. 3. Alternating Schwarz preconditioner. Numerical solution of the HRWE each subdo- 
main/subregion above defines the preconditioner. Boundary conditions for the solves are obtained 
through subdomain/subregion interpolation as described in the text. For the outer shell shown in 
(c), the small dot in the center is, to scale, the inner configuration comprised of (a) and (b). 



— T> (GMRES solve, alternating Schwarz method as PC) 

J, H (GMRES solve with block-LC/ PC) 
R — 1 (GMRES solve, alternating Schwarz method as PC) 
O (direct block-LC/ solve) 

B (GMRES solve with block-LC/ PC) 
C (GMRES solve with block-LC/ PC) 
D (GMRES solve with block-LC/ PC) 
k G (GMRES solve with h\ocY-LU PC) 



interpolation 
between solves 



interpolation 
between solves 



TABLE X. Multilevel preconditioning scheme. 



V. CONCLUSION 



We close by summarizing the results of this paper and describing our outlook on future 
work. In both the summary and description, we discuss both our numerical methods and 
the physical problem we aim to solve. 



A. Results 



We have demonstrated the feasibility of solving a partial differential equation in three 
independent variables by modal spectral methods based on the technique of integration 
preconditioning. As designed, the technique yields an algorithmic way to achieve a sparse 
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a = o.i 



N 1 


9 


Nl 


ATf 


N 3 


N* 


N* 


L,2 error 


L2 norm 


Loo error 


Loo norm 


iterations 


tolerance 


13 


5 


7 


17 


7 


17 


7 


2.2251E-06 


9.8806E-01 


2.7804E-05 


2.4773E+00 


17 


1.0000E-06 


19 


9 


9 


23 


9 


23 


9 


3.5812E-08 


1.0077E+00 


2.2023E-07 


2.4781E+00 


22 


1.0000E-08 


23 


13 


14 


31 


16 


31 


14 


1.2344E-10 


1.0063E+00 


1.2046E-09 


2.4782E+00 


28 


1.0000E-10 


29 


19 


15 


35 


15 


35 


15 


6.6478E-12 


1.0083E+00 


7.4462E-11 


2.4783E+00 


42 


1.0000E-12 


29 


19 


18 


39 


21 


39 


18 


4.9252E-13 


1.0063E+00 


5.3570E-12 


2.4783E+00 


45 


1.0000E-13 



TABLE XL Solution of the HRWE on the glued cylinder subregion G. The reported 
truncations N} and N\ were also used for cylinders 2,3,4, and 5. 



n = 0.1 


MPSPD 


L2 error 


L2 norm 


Loo error 


Loo norm 


iterations 


tolerance 


15.7 


3.7532E-06 


7.0509E-01 


9.9579E-05 


3.6556E+00 


5 


1.0000E-05 


23.9 


4.2440E-08 


7.8382E-01 


5.8222E-07 


3.6563E+00 


3 


1.0000E-07 


31.0 


2.6333E-10 


8.3492E-01 


4.0406E-09 


3.6564E+00 


3 


1.0000E-09 


37.2 


4.1855E-12 


9.3982E-01 


8.6696E-11 


3.6565E+00 


3 


1.0000E-11 


37.9 


4.7733E-13 


9.5252E-01 


8.2254E-12 


3.6565E+00 


2 


1.0000E-12 



TABLE XII. Solution of the HRWE on the 2-center multidomain V. Here MPSPD 
stands for modes per subdomain per dimension. Note that an MPSPD of 37.9 corresponds to 
(11 subdomains) x (37. 9 3 ) ~ 599000 unknowns. 



spectral formulation of the PDE problem with consistent incorporation of boundary condi- 
tions. However, particularly in higher dimensional settings, an integration "preconditioner" 
may not be an optimal approximate inverse in any known sense; as a result the technique 
would not seem practical in and of itself. Here we mean that, for a higher dimensional prob- 
lem like ours, the sole use of integration preconditioning will likely lead to prohibitively large 
iteration counts when using Krylov methods and/or loss of accuracy due to poor condition- 
ing. At least for our problem, we have demonstrated that both issues may be surmounted by 
further preconditioning. In particular, studying our problem on a given subdomain (spec- 
tral element), we have empirically demonstrated that block Jacobi preconditioning (with 
each block inverted by LU factorization) is effective for the banded matrix produced by 
integration preconditioning. Moreover, for the matching of subdomains in our multidomain 
approach the alternating Schwarz method (an elementary domain decomposition precondi- 
tioner) works well. Given that little seems known about preconditioning for modal methods, 
whereas preconditioning for nodal methods is well developed, we believe that our demon- 
stration of effective modal preconditioning based on rather standard methods is remarkable. 

In addition to modal preconditioning, other aspects of our work are new from the stand- 
point of modal spectral methods, in particular its multidomain character and focus on a 
mixed-type problem. Ref. jH already presented the outline for applying the integration pre- 
conditioning technique to higher dimensional problems, that is to PDEs. While we have 
carried out and presented the details of such an application, our work has gone further in 
developing a 3D multidomain version of the technique (Ref. [g] consider the multidomain 
case in 2D). In particular, we have presented the details of gluing constituent subdomains, 
and how this gluing is reflected in the overall linear system. As another new, and un- 
usual, aspect, our work is the first successful application of integration preconditioning to a 
three dimensional mixed-type problem, a problem with both elliptic and hyperbolic regions. 
Whence it has numerically confirmed once more (cf. BHHQ) that such problems can 
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be well-posed; see [44| for a theoretical discussion. The use of a multidomain decomposition 
is of special interest for mixed problems like ours, since the type change need not occur in all 
subdomains. Indeed, for our example, it occurs on a cylinder that intersects only the outer 
spherical shell. When the nonlinearities of the actual physical problem are included, this 
feature of our domain decomposition may prove useful, because the true physical equations 
will be only mildly nonlinear on the outer shell, with the strongest nonlinearities confined to 
subdomains on which the equations are elliptic. Our work therefore suggests that we might 
treat the outer shell differently from the inner subdomains when solving the full nonlinear 
problem. 



B. Outlook 

While we have demonstrated that our mix of methods delivers efficiency and remarkable 
accuracy when applied to a nontrivial 3D model problem, a number of issues merit further 
investigation. These include both particular challenges in the application of this paper's 
methods to helically symmetric general relativistic binary fields (the problem of our interest), 
and numerical analysis questions pertaining to integration preconditioning as a method for 
more general problems. 

The numerical analysis issues center on the value of integration preconditioning, or spar- 
sification, in the solution of higher dimensional PDEs, particularly in the context of a mul- 
tidomain approach. Here we have applied the method to only one linear PDE, with an 
empirical demonstration of its success. For any given linear equation, a fuller investigation 
of integration sparsification for multidomain scenarios would focus on the interplay between 
condition number, field of values (Rayleigh quotients), and computational efficiency (iter- 
ation counts). All of these issues would be examined both before and after some form of 
"ordinary" preconditioning, e.g. the combination of b\ock-LU and alternating Schwarz pre- 
conditioning used in this paper. The sparse matrices produced by integration sparsification 
allow for quicker matrix multiplies in a Krylov method like GMRES. Our work suggests that 
this advantage might be gained without large iteration counts, but the issue deserves more 
careful study. The efficient treatment of nonlinearities is also worthy of investigation, and 
any such study would build upon the results already given in Ref. At present, we are 
in process of evaluating integration sparsification in the context of these issues, mostly with 
2D model problems. 

Several challenges remain if we are to apply some variant of our method to the problem 
of helically symmetric general relativistic binary fields. First, we must test the efficiency of 
our method in solving a nonlinear HRWE. In practice, this should not be a problem. The 
strongest nonlinearities will occur closest to the black hole sources, i.e. near the surfaces on 
which the inner boundary conditions are set. By choosing these boundaries some distance 
from the sources, we can, at the cost of accuracy in mathematically representing the physical 
problem, reduce the severity of the nonlinearities. The real question, then, is not whether we 
can handle nonlinearities, but how close to the sources the inner boundaries can be placed. 
Second, we must replace the outgoing radiative boundary conditions with "standing wave 



boundary conditions," as described in Ref. [25j. This change is straightforward in a linearized 



problem, and, as explained in Ref. [25|], should not pose great difficulty in nonlinear general 
relativity. Third, we must move from the scalar problem considered here to the actual tensor 
problem. Solution of the helically symmetric problem of a binary in full general relativity 
will require all the information in the tensorial fields, and the coupling of those fields. This 
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proved to be the greatest challenge for the solution method presented in Ref. 29], and it 
severely limited the achievable accuracy. We are confident that the method described in this 
paper will deliver the accuracy needed to find useful solutions. 

The methods developed here have been motivated by the problem of binary inspiral in 
general relativity. However, our methods may find broader use; they might be applied to 
problems distinct from the helically symmetric mixed PDEs of the periodic standing wave 
approximation. As a salient example, multidomain spectral methods are already being used 



in the elliptical problem of generating binary black hole initial data [30j, |3jj. Our set of 



methods, with integration sparsification, might be used as an alternative approach. 
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Appendix A: Explicit solution for a point source. 

This appendix presents two representations for an exact solution to the HRWE, namely 
the solution for a point source in a circular orbit. Superposition of two such solutions 
yields the binary field exploited in our numerical tests. As before, let (x, y,z — z) — 
(r sin 9 cos (p, r sin 9 sin ip,r cos 9) represent the comoving Cartesian coordinates, where ip = 
cf) — Qt. In terms of the comoving coordinates, we define the Laplacian V 2 = <9| + <9~ + <9 2 
and dp operators and consider the inhomogeneous HRWE 

(V 2 - Q 2 d 2 ^ = -47r^^(cosfl)% - y ), (Al) 

where (a,Tr/2,<p ) specifies the location of the source point in the spherical polar system 
associated with (x,y,z). We set <p = 0, since this shift can always be reinserted via the 
replacement ip — > <p — ipo in the representations (1A2I) and ( 1A11I) given below. Using standard 
methods of separation of variables and one-dimensional Green's functions, we find the series 
representation for a particular solution to PAID . 

oo n e 

1 _ , „ s _ ,_ % r< 



iP(x,y,z) = 2^— — P, (cos^)P,o(0) ; 



1=0 Zl+ L r > 



— 4f2 mPi m (cos9)Pi m (0)ji(rnfi,r < ) [n^(mf)r>) cos(my?) + ^(mfir>) sin (my)] . 

(A2) 



1=1 m=l 



Here Pt m (u) is a normalized associated Legendre function, a nd j i{z) and rigiz) are respec- 
tively spherical Bessel functions of the first and second kind [38J. Moreover, we adopt the 
standard notations r< = min(a, r) and r> = max(a, r). The series is poorly convergent near 
r = a; however, for Q <^ 1 it converges rapidly for r ^> a. 
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To derive a separate representation of the same series which can be used near r = a, we 
consider the equivalent problem for the 3+1 wave equation written in inertial, rather than 
comoving coordinates, 

(V 2 - <9 2 )^ = -47r ^ (r 7 a ^ (cosg)(5(0 - Qt). (A3) 

In the inertial frame the source point has the time-dependent location 

£(t) = acos(£lt)e x + a sin (fit) e y . (A4) 

Therefore, we wish to find the retarded solution to 

(V 2 - <9 2 )^ = -4tt5( 3) (x - £(*)), (A5) 

and then evaluate it at the field point 

x(t) = ze z + p cos(0 + Qt)e x + p sin(0 + Qt)e y , (A6) 

where p 2 = x 2 + y 2 = x 2 + y 2 . Notice that the evaluation point x(£) rotates with the source; 
whence this latter evaluation will effectively remove the time dependence. The retarded-time 
Green's function for the wave operator is 

G r ct(t,x;t',x') = - ^ (t -f- |X ~ X/|) , (A7) 

47T |X — X'| 

and it obeys 

(V 2 - d 2 )G rct (t, x; t', x') = -6(t - t')6® (x - x') . (A8) 

We obtain the desired solution to (1A5|) via spacetime convolution of 4ir5( 3 \£(t)) with 
G Tf .±(t, x; t' xQ. The details of this calculation are given in the textbook by Matthews and 
Walker [43], and the result is 

tf(t,x) = J- , (A9) 

|x •(*-*(*))' 

where !• denotes the derivative of £(£') with respect to its argument. Here t' is retarded time 
which obeys t — t' = |x — £(t')\. Setting x = x(t), we find 

(t - t') = [z 2 + p 2 + a 2 - 2ap cos(0 - fit')] V2 

= [z 2 + p 2 + a 2 - 2ap cos(<p + Q{t - t'))] 1/2 . (A10) 

With (p, z, if) near (a, 0, 0), we may use the last formula to numerically compute t — t' via 
fixed-point iteration. Finally, since ip(x,y,z) = $?(t, x(t)), we then have 

${x, y, z) = —j- (All) 

[z 2 + p 2 + a 2 - lap cos(<p + fi(t - f ))] ' - apVL sin(<^ + - t')) 
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as a concrete expression for ( 1A9I) . With both Eqs. (1A2I) and (lAllf) at our disposal, we 
can evaluate the retarded solution to (1A1|) with enough accuracy (uniformly over the entire 
2-center domain) to make the comparisons reported in Sec. IIVI 
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